Reply to Prof. Pearson's criticisms

BY 

J. C. KAPTEYN. 


In Nos. 1 and 2 of Biometrika, Vol. IV, Prof. Pearson,
referring to my paper "Skew frequency curves in biology
and statistics" ¹):

1st. maintains that in my theory I have followed Edgeworth
without acknowledging his priority;

2nd. refutes my or Edgeworth's theory.

As to the first point: I must plead guilty in part and
I offer Prof. Edgeworth my apologies. I confess to having
overlooked his papers. I may perhaps adduce as an extenuating
circumstance that these papers have also been
overlooked in the bibliographies of both Prof. Ludwig and
of Davenport, the only bibliographies on the subject with
which I am acquainted.

At my request Prof. Edgeworth kindly sent me a reprint
of his papers in the Journal of the Statistical Society.
In Vol. 61, Part 4, the author, in accordance with what is
said on pages 10-12 of my paper, remarks in substance
as follows:

When a certain character (x) is distributed according to
a normal frequency curve, then other characters proportional
to any function φ(x) of that character will be
generally distributed according to an asymmetrical
frequency curve.

¹) P. Noordhoff, Groningen, 1903.

This remark is undoubtedly correct and, rightly translated
into a mathematical formula, would lead to the following
equation

(1)    $y = \dfrac{h}{\sqrt{\pi}}\, F'(x)\, e^{-h^{2}[F(x) - M]^{2}}$

F(x) being the solution for z of the equation x = φ(z).

Now this equation is no other than the fundamental 
equation at which I arrived in my paper (p. 16). 
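
As an illustration only, equation (1) can be evaluated numerically for one particular choice of F. The sketch below assumes F(x) = ln x (that is, x = φ(z) = e^z) with h = 1 and M = 0. These values, and the helper names `kapteyn_density` and `integrate`, are assumptions made for the example and are not taken from the paper. It checks that the resulting curve has unit area and that its mean lies above its median, which is one sign of asymmetry.

```python
import math

def kapteyn_density(x, F, dF, h=1.0, M=0.0):
    """Equation (1): y = h/sqrt(pi) * F'(x) * exp(-h^2 * (F(x) - M)^2)."""
    return h / math.sqrt(math.pi) * dF(x) * math.exp(-(h * (F(x) - M)) ** 2)

# Illustrative assumption: F(x) = ln x, so that F'(x) = 1/x.
F = math.log
dF = lambda x: 1.0 / x

def integrate(g, a, b, n=100_000):
    """Plain midpoint rule on [a, b] with n equal steps."""
    step = (b - a) / n
    return step * sum(g(a + (i + 0.5) * step) for i in range(n))

# The curve is negligible outside (0, 200) for h = 1, M = 0.
area = integrate(lambda x: kapteyn_density(x, F, dF), 1e-9, 200.0)
mean = integrate(lambda x: x * kapteyn_density(x, F, dF), 1e-9, 200.0)
median = math.exp(0.0)  # the median satisfies F(median) = M

print("area under the curve:", area)
print("mean:", mean, "median:", median)
```

With a symmetric choice such as F(x) = x the same function returns the ordinary normal curve, and mean and median coincide.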

If, notwithstanding this, I still feel justified in claiming 
my part in the ownership of this formula, it is on the 
ground that Prof. Edgeworth’s remark, correct though it 
be, is still not equivalent to a general theory. 

It does not prove that (1) must be the general equation 
of frequency curves. Prof. Edgeworth expressly says 
that it is not (l.c. p. 8). Nor does the theory, and this is
the all-important point, connect these curves in any way
with the causes which give rise to them.

The general theory involves the solution of this problem 
(and its reverse): 

"On certain quantities x, which at starting are equal,
there come to operate certain causes of deviation, the
effect of which depends in a given way on the value
of x. What will be the frequency-curve produced?"

It is this problem which I treated in my paper and of
which the general solution is given on pp. 15-16. It leads to
the identical equation (1), when the effect of the causes
is proportional to $1/F'(x)$.

The difference in the significance of the result, however, 
is evident. 
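
The problem stated above can be imitated by a small simulation. This is a rough sketch under stated assumptions, not the method of the paper: F(x) = ln x is chosen for illustration, so that the effect of each cause is proportional to 1/F'(x) = x; each cause is taken to be a random shock of sign ±1 and relative size `eps`; and the names `simulate` and `skewness` are hypothetical helpers. All quantities start equal, many small causes act on them, and the skewness of x is compared with that of F(x).

```python
import math
import random

def skewness(values):
    """Third central moment divided by the 1.5th power of the second."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

def simulate(inv_dF, x0=1.0, individuals=4000, causes=200, eps=0.05, seed=1906):
    """Let `causes` small shocks act on each of `individuals` equal starting values.

    The effect of every shock is eps * inv_dF(x) * (+1 or -1), i.e. it depends
    on the value of x reached so far.
    """
    rng = random.Random(seed)
    result = []
    for _ in range(individuals):
        x = x0
        for _ in range(causes):
            x += eps * inv_dF(x) * rng.choice((-1.0, 1.0))
        result.append(x)
    return result

# Illustrative assumption: F(x) = ln x, hence 1/F'(x) = x.
xs = simulate(lambda x: x)
skew_x = skewness(xs)                         # distribution of x itself
skew_F = skewness([math.log(x) for x in xs])  # distribution of F(x)

print("skewness of x:   ", skew_x)
print("skewness of F(x):", skew_F)
```

In this sketch x itself comes out markedly skew while F(x) is nearly symmetrical, which is the situation that equation (1) describes.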
Prof. Pearson overlooks the difference. He completely
ignores the general problem which constitutes the real
subject of my paper and says (p. 199): "He (i.e. Kapteyn)
assumes that some quantity obeys the normal distribution",
whereas there is no question of such an assumption either
in the enunciation of the problem or in its solution.

I am sorry to state that this is not the only inexact
representation of the contents of my "Skew Curves". This
is particularly disappointing in a paper which shows good
evidence of the fact that the author has largely profited
by the exposition of the theory which he refutes.

After perusing this refutation I strongly felt that it
would be right to abstain from any reply, save on
the question of priority.

Any trained mathematician would, without difficulty, 
judge for himself. 

After a while, however, I came to consider that naturalists
and most of the other persons mainly interested in
the matter can hardly be expected, as a rule, to be
sufficiently well trained in mathematics to see for themselves
where the truth lies. Thus real advantage might be gained
by not letting the matter rest.

It is this consideration that made me resolve, and this 
brings me to my second point, to devote at least a few 
lines to a direct reply to the criticisms brought forward 
against my theory. 

For the purpose in view, however, no detailed reply is at 
all necessary. It will be sufficient to show: 

I. That Prof. Pearson actually adopts my theory 
(which he refutes) as the only rigorous and general one; 

II. That Pearson's formulae, even now that he has
tried to derive them from our equation (1), may, at the
very best, be accepted as empirical representations.

These statements must seem startling. Still, nothing is easier
than to show their correctness; in fact Prof. Pearson
has gone to great pains to destroy his own theory.

[In what follows the pages quoted refer to Prof. Pearson's
paper in parts I and II of the present volume of
this Journal.]

On page 210 (and in several other places) the equation

(2)    $\dfrac{1}{y}\dfrac{dy}{dx} = \dfrac{-x}{\sigma^{2} f\!\left(\frac{x}{\sigma}\right)}$

is stated by Prof. Pearson to represent the "generalised
probability curve for an infinite number of cause groups". ¹)
On page 211 again he asserts that all discussion of
asymmetrical frequency must turn on this equation. Only, in
accordance with p. 178, it is here written, with a slightly
different notation,

(3)    $\dfrac{1}{y}\dfrac{dy}{dx} = \dfrac{-x}{\sigma^{2} F(x)}$

According to Prof. Pearson this equation leads at once
to his (Prof. Pearson's) generalised probability curves
by expanding $f(x/\sigma)$ in a series of ascending powers of
$x/\sigma$ (pp. 210, 211). "A very few terms of the expansion,
however, suffice for describing practical frequency
distribution" (p. 211). According to pp. 204 and 212 Prof.
Pearson's curves stop at three terms; in fact he puts
(pp. 204 and 212)

(4)    $F(x) = a_0 + a_1 x + a_2 x^{2}$

¹) The express condition of very numerous causes of deviation
has been adhered to throughout in my "Skew Curves". Considerations
based on a supposed very restricted number of causes can be easily
shown to be illusory in nearly every case of asymmetric frequency.

Now this equation (2) or (3), which thus is stated to be
the true general equation of the frequency curves, is not
Prof. Pearson's equation, but simply the equation of
the curves of Edgeworth-Kapteyn ¹). The identity is
only hidden by the fact that it is the differential equation,
whereas I derived at once the equation in its finite form.

Everybody may convince himself of the fact by simply
differentiating equation (1). In order to accommodate to Prof.
Pearson's notation in equation (3) he has only to substitute:

       $f(x)$ for $F(x) - M$ ²)

       $\dfrac{1}{2\sigma^{2}}$ for $h^{2}$

so that this equation becomes:

(5)    $y = \dfrac{1}{\sigma\sqrt{2\pi}}\, f'(x)\, e^{-\frac{f^{2}(x)}{2\sigma^{2}}}$

and further to introduce Prof. Pearson's abbreviation
(p. 178)

(6)    $F(x) = \dfrac{x\, f'(x)}{f(x)\, f'^{2}(x) - \sigma^{2} f''(x)}$

This proves point I.
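
The differentiation referred to above can be written out as a short check. It is a sketch under assumptions: the finite equation is taken in the form $y = C\, f'(x)\, e^{-f(x)^2/(2\sigma^2)}$, where C stands for a constant factor whose value plays no part, and the differential equation in the form $\frac{1}{y}\frac{dy}{dx} = -x/(\sigma^{2} F(x))$. The symbol C and the intermediate steps are not in the original.

```latex
\begin{align*}
\ln y &= \ln C + \ln f'(x) - \frac{f(x)^{2}}{2\sigma^{2}}, \\
\frac{1}{y}\frac{dy}{dx}
  &= \frac{f''(x)}{f'(x)} - \frac{f(x)\,f'(x)}{\sigma^{2}}
   = -\,\frac{f(x)\,f'(x)^{2} - \sigma^{2} f''(x)}{\sigma^{2} f'(x)}, \\
\text{so that}\quad
\frac{1}{y}\frac{dy}{dx} &= \frac{-x}{\sigma^{2} F(x)}
\quad\text{with}\quad
F(x) = \frac{x\,f'(x)}{f(x)\,f'(x)^{2} - \sigma^{2} f''(x)} .
\end{align*}
```

As a sanity check, f(x) = x gives F(x) = 1, and the differential equation reduces to that of the normal curve.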

As to point II. 

One would naturally imagine that, if it be true, as
shown just now, that Prof. Pearson derives his own
curves from those given by myself, the two sets of curves
must be identical; the only possible difference being that my
formulae must be rigorous, whereas Prof. Pearson's, in
which only a few terms of Maclaurin's series are used,
must be only more or less approximate.

¹) If Prof. Edgeworth has no objection I will gladly adopt
this denomination applied to them by Prof. Pearson.

²) This does not mean, as Prof. Pearson erroneously supposes,
that we choose the mode as the origin (p. 178).

As a matter of fact this identity, or approximate identity,
will not exist at all but in a few very exceptional
cases.

As a consequence thereof Pearson's formulae will lose
their rational character.

The reason is that to substitute the expression

(7)    $a_0 + a_1 x + a_2 x^{2}$

for F(x), and for a long range of values of x, is permissible
(even as an approximation) only in quite exceptional
cases.

If it be permissible to substitute the expression (7) for
F(x), why not for

       $\dfrac{x}{F(x)}$

which would make the equation (3) still simpler, or, simplest
of all, why not take (7) for the ordinates of the
frequency-curves themselves?

The only possible answer is that experience shows that
Pearson's assumption leads to equations which can be
made to represent tolerably a great number of observed
frequency-curves, whereas the other assumptions do not.

But this is equivalent to admitting that Pearson's
curves are purely empirical; which is just what I maintain.

This settles point II. ¹)

In Conclusion. As Prof. Pearson now derives his own 
theory from mine, it need not be said that every objection 
raised by him against my general theory bears directly on 
his own. 

¹) The same reasoning still holds of course in the case that
more terms of a Maclaurin expansion are included.

Of the objections contained in his paper against the
special case (causes proportional to some power of x + κ)
more fully developed by me, some are as astonishing
as the points here treated. I will say nothing about
them, however, though I do not admit the validity of a
single one of them. This only may be pointed out, that
Prof. Pearson's statement (p. 178) that this form "has
been suggested by Kapteyn as a general form of the
Skew frequency-curves" is erroneous. ¹)

Quite recently some cases have been submitted to me 
which are evidently not contained in my special form. 
To meet such cases I have developed the general theory 
somewhat more fully in a paper now ready for press. 

Groningen, January 1906.


1) See for instance pp. 18 and 29 of my paper. 



